k nearest neighbor query based on parallel ant colony algorithm in obstacle space

GUO Liangmin, ZHU Ying, SUN Liping

Journal of Computer Applications 2019, 39 (3): 790-795. DOI: 10.11772/j.issn.1001-9081.2018081647

To solve the problem of k nearest neighbor query in obstacle space, a k nearest neighbor Query method based on an improved Parallel Ant colony algorithm (PAQ) was proposed. Firstly, ant colonies with different kinds of pheromones were used to search for the k nearest neighbors in parallel. Secondly, a time factor was added as a condition for judging path length, so that the searching time of the ants is shown directly. Thirdly, the initial pheromone concentration was redefined to avoid blind searching by the ants. Finally, visible points were introduced to divide an obstacle path into multiple Euclidean paths; meanwhile, the heuristic function was improved, and the ants selected visible points by probabilistic transfer, which guides the ants in a more suitable search direction and prevents the algorithm from falling into a local optimum prematurely. Compared with the WithGrids method, when the number of data points is less than 300, the running time is reduced by about 91.5% on average for line segment obstacles and by about 78.5% on average for polygonal obstacles. The experimental results show that the proposed method has an obvious running-time advantage on small-scale data and can process polygonal obstacles.
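
The visible-point idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's PAQ method: it assumes line segment obstacles, a proper-intersection visibility test, and the textbook ant colony transition rule (pheromone^alpha times heuristic^beta), with a heuristic equal to the inverse of the step length plus the straight-line distance to the target. The function names, `alpha`, `beta`, and the default pheromone value of 1.0 are all assumptions made for illustration.

```python
import math

def _orient(p, q, r):
    """Signed area of triangle pqr; the sign gives the turn direction."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at an interior point of both."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)

def visible(p, q, obstacles):
    """q is visible from p if segment pq crosses no obstacle segment.
    Touching an obstacle endpoint does not count as crossing."""
    return not any(segments_cross(p, q, s, e) for s, e in obstacles)

def transition_probabilities(current, candidates, target, pheromone,
                             obstacles, alpha=1.0, beta=2.0):
    """Probability of moving from `current` to each visible candidate.
    Assumed rule: weight = pheromone^alpha * heuristic^beta."""
    weights = {}
    for node in candidates:
        if node == current or not visible(current, node, obstacles):
            continue
        # Assumed heuristic: prefer short detours towards the target.
        detour = math.dist(current, node) + math.dist(node, target)
        eta = 1.0 / (detour + 1e-9)
        weights[node] = pheromone.get(node, 1.0) ** alpha * eta ** beta
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()} if total else {}
```

Because obstacle endpoints are never treated as blocked, they can serve as the visible points through which an obstructed path is split into straight Euclidean legs.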

Density peaks clustering algorithm based on shared near neighbors similarity

BAO Shuting, SUN Liping, ZHENG Xiaoyao, GUO Liangmin

Journal of Computer Applications 2018, 38 (6): 1601-1607. DOI: 10.11772/j.issn.1001-9081.2017122898

Density peaks clustering is an efficient density-based clustering algorithm. However, it is sensitive to the global parameter d_c, and manual intervention is needed to select the clustering centers from the decision graph. To solve these problems, a new density peaks clustering algorithm based on shared near neighbors similarity was proposed. Firstly, the Euclidean distance and the shared near neighbors similarity were combined to define the local density of a sample, which avoids setting the parameter d_c of the original density peaks clustering algorithm. Secondly, the selection process of clustering centers was optimized so that the initial clustering centers are selected adaptively. Finally, each remaining sample was assigned to the same cluster as its nearest neighbor of higher density. The experimental results show that, compared with the original density peaks clustering algorithm on the UCI datasets and the synthetic datasets, the average values of accuracy, Normalized Mutual Information (NMI) and F-Measure of the proposed algorithm are increased by about 22.3%, 35.7% and 16.6% respectively. The proposed algorithm can effectively improve the accuracy of clustering and the quality of clustering results.
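
As a rough illustration of how a local density can be defined without d_c, the sketch below combines a shared near neighbors count with Euclidean distance. The abstract does not give the paper's formula, so the density used here (shared-neighbor count divided by one plus the distance, summed over the k nearest neighbors) is an assumption, as are the function names and the choice to include each point in its own neighborhood.

```python
import math

def knn_sets(points, k):
    """For each point, the index set of its k nearest neighbors plus itself."""
    neighbours = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        order = sorted(others, key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(order[:k]) | {i})
    return neighbours

def snn_similarity(neighbours, i, j):
    """Shared near neighbors similarity: size of the neighborhood overlap."""
    return len(neighbours[i] & neighbours[j])

def local_density(points, k):
    """Assumed density: sum over the k nearest neighbors of
    shared-neighbor count / (1 + Euclidean distance)."""
    neighbours = knn_sets(points, k)
    density = []
    for i, p in enumerate(points):
        rho = sum(
            snn_similarity(neighbours, i, j) / (1.0 + math.dist(p, points[j]))
            for j in neighbours[i] - {i}
        )
        density.append(rho)
    return density
```

The only parameter is the neighborhood size k; an isolated point receives a low density because its neighbors are far away, even when it shares neighbors with them.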

Spectral clustering algorithm based on differential privacy protection

ZHENG Xiaoyao, CHEN Dongmei, LIU Yuqing, YOU Hao, WANG Xiangshun, SUN Liping

Journal of Computer Applications 2018, 38 (10): 2918-2922. DOI: 10.11772/j.issn.1001-9081.2018040888

To address the problem of privacy leakage in the application of traditional clustering algorithms, a spectral clustering algorithm based on differential privacy protection was proposed. Based on the differential privacy model, the cumulative distribution function was used to generate random noise that follows the Laplace distribution. The noise was then added to the sample similarity function calculated by the spectral clustering algorithm, which perturbs the weight values between individual samples and hides information between samples for privacy protection. Experimental results on UCI datasets verify that the proposed algorithm can achieve effective data clustering within a certain degree of information loss, and can also protect the clustered data.
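
The noise-generation step described above can be sketched with inverse-CDF sampling of the Laplace distribution. The rest of the sketch is assumed for illustration and is not taken from the paper: a Gaussian kernel as the similarity function, a noise scale of `sensitivity / epsilon`, a default sensitivity of 1.0, and the function names.

```python
import math
import random

def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample by inverting the CDF."""
    u = rng.random() - 0.5          # uniform on [-0.5, 0.5)
    while u == -0.5:                # avoid log(0) at the boundary
        u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_similarity(points, sigma, epsilon, sensitivity=1.0, seed=None):
    """Gaussian-kernel similarity matrix with Laplace noise on each weight.
    The kernel and the sensitivity value are assumptions for illustration."""
    rng = random.Random(seed)
    scale = sensitivity / epsilon   # smaller epsilon means more noise
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            sim = math.exp(-d * d / (2.0 * sigma * sigma))
            # One noise draw per pair keeps the matrix symmetric.
            weights[i][j] = weights[j][i] = sim + laplace_noise(scale, rng)
    return weights
```

The perturbed weights can be negative, so a full pipeline would need to decide how to treat them before building the graph Laplacian; that choice is outside this sketch.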

Enterprise abbreviation prediction based on constitution pattern and conditional random field

SUN Liping, GUO Yi, TANG Wenwu, XU Yongbin

Journal of Computer Applications 2016, 36 (2): 449-454. DOI: 10.11772/j.issn.1001-9081.2016.02.0449

With the continuous development of enterprise marketing, enterprise abbreviations have become widely used. Nevertheless, as one of the main sources of unknown words, enterprise abbreviations cannot be identified effectively. A method for predicting enterprise abbreviations based on constitution patterns and the Conditional Random Field (CRF) was proposed. First, the constitution patterns of enterprise names and abbreviations were summarized from a linguistic perspective, and the Bi-gram algorithm was improved by combining a lexicon with rules, yielding the CBi-gram algorithm. CBi-gram was used to segment enterprise names automatically and to improve the recognition accuracy of a company's core word. Then the enterprise type was subdivided by CBi-gram, and the abbreviation rule sets were collected by manual summarization and a self-learning method to reduce the noise caused by unsuitable rules. In addition, to compensate for the limitations of manually built rules on abbreviations and mixed abbreviations, the CRF was introduced to generate enterprise abbreviations statistically, with word, tone and word position used as features to train a supplementary model. The experimental results show that the method performs well and that its output largely covers the usual range of enterprise abbreviations.
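
One common way to prepare data for a sequence labeller such as a CRF is to label each character of the full name as kept or skipped in the abbreviation. The sketch below assumes that framing, which the abstract does not confirm; the label names `K` and `S`, the function names, and the feature keys are hypothetical, and the tone feature mentioned above is omitted because it needs a pronunciation lexicon.

```python
def label_characters(name, abbreviation):
    """Greedy left-to-right alignment: 'K' if the character is kept in the
    abbreviation, 'S' if it is skipped. Raises if no alignment exists."""
    labels = []
    j = 0
    for ch in name:
        if j < len(abbreviation) and ch == abbreviation[j]:
            labels.append("K")
            j += 1
        else:
            labels.append("S")
    if j != len(abbreviation):
        raise ValueError("abbreviation is not a subsequence of the name")
    return labels

def character_features(name):
    """One feature dictionary per character: the character and its position."""
    last = len(name) - 1
    return [
        {"char": ch, "index": i, "is_first": i == 0, "is_last": i == last}
        for i, ch in enumerate(name)
    ]
```

For example, aligning a four-character name with a two-character abbreviation made of its first and third characters yields the labels K, S, K, S, which can be paired with the feature dictionaries as one training sequence.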
